Near ML decoding method based on metric-first search and branch length threshold

ABSTRACT

In this invention, we propose a near maximum likelihood (ML) method for the decoding of multiple input multiple output systems. By employing the metric-first search method, Schnorr-Euchner enumeration, and branch length thresholds in a single frame systematically, the proposed technique provides a higher efficiency than other conventional near ML decoding schemes. From simulation results, it is confirmed that the proposed method has lower computational complexity than other near ML decoders while maintaining the bit error rate (BER) very close to the ML performance. The proposed method in addition possesses the capability of allowing flexible tradeoffs between the computational complexity and BER performance.

FIELD OF THE INVENTION

The present invention relates generally to wireless communication systems, and more particularly to a decoding method for use in multiple input multiple output (MIMO) wireless communication systems.

BACKGROUND OF THE INVENTION

It is well known that MIMO systems can provide high spectral efficiency compared to single input single output (SISO) systems for wireless communications. The MIMO system is considered one of the principal technologies for the next generation mobile communication because no additional bandwidth or transmit power is required to increase the capacity of the system. As in the case of SISO systems, several decoding schemes have been considered in many studies for the decoding of MIMO systems. Among the decoding schemes for MIMO systems, the maximum likelihood (ML) decoder provides the optimal bit error rate (BER) performance at the expense of a quite severe computational requirement.

The sphere decoder (SD) has been introduced as an interesting means to reduce the excessive computational complexity of the conventional full-search ML decoder. The breadth-first signal decoder (BSIDE) has recently been proposed and shown to have lower computational complexity than the SD in general. Despite various studies for designing ML decoders with a reduced computational complexity, however, the computational complexity of the ML decoders is still somewhat higher than what practical systems can afford.

In a number of studies, several near ML decoders with a reasonable loss in the BER performance have been proposed to achieve lower computational complexity for the decoding of MIMO systems. Most of the near ML decoders perform QR decomposition (QRD) of the channel matrix and regard the decoding problem as a problem of searching for a lattice point with the smallest node metric by employing the depth-, breadth-, or metric-first search method on a tree.

Among a variety of near ML decoders, the Schnorr-Euchner2 (SE2) scheme and the increasing radii algorithm (IRA) have been proposed to alleviate the exponentially growing computational complexity of the depth-first search when the number of layers increases. As variants of the SD, the SE2 and IRA both employ unique methods for the determination of the threshold and repeat searching the tree back and forth to find a node with the smallest metric. The SE2 scheme reduces the computational complexity based on Schnorr-Euchner (SE) enumeration with a Fano-like metric bias and an early termination technique. On the other hand, the IRA reduces the computational complexity by pruning the search space statistically, offering substantial computational savings when the number of antennas is large. Although the SE2 and IRA both achieve near ML performance with low computational complexity, the SE2 requires an estimate of the signal to noise ratio (SNR) and the IRA is required to restart the search from the beginning with an increased radius when no feasible point is found.

As for the decoding schemes based on the breadth-first search method, the QRD-M scheme is based on the classical M-algorithm and exhibits quite a low computational complexity, searching the tree only in the 'forward' direction from the root of a tree. In order to prevent a full search of the tree, the QRD-M retains only M nodes with the smallest node metric in each layer. The adaptive QRD-M and efficient QRD-M schemes have also been proposed to further reduce the computational complexity of the QRD-M. The efficient QRD-M achieves a reduction in the computational complexity by discarding a node when the node metric is larger than a threshold; however, the partial decision feedback equalizer (DFE) solution and the Euclidean distance of the DFE solution need to be computed in each layer.

Based on the metric-first search, the QRD-Stack scheme relies on the stack algorithm and searches branches extended from a node with the smallest node metric. Although the QRD-Stack allows a low computational complexity by retaining only a few nodes for search, backtracking is quite frequent at low SNR since a number of nodes, not necessarily in the same layer, are considered simultaneously.

SUMMARY OF THE INVENTION

In this invention, we propose a near ML decoding method, called decoding with expected length and threshold approximated (DELTA). The DELTA is a novel scheme incorporating the metric-first search, SE enumeration, and branch length threshold in a single frame systematically for the decoding of MIMO systems. A novel method of obtaining the branch length threshold is also proposed in this invention. The proposed threshold is a measure of the expected length of the paths from the parent node of a best node to a node in the first layer, and has the distinct characteristic that the threshold can be obtained by using the channel matrix only. Based on the metric-first search and by employing the branch length threshold and SE enumeration, the DELTA searches the tree in a unique way not considered before. Specifically, the DELTA (1) finds a node with the smallest node metric in the tree, (2) determines if the node deserves to be searched, and (3) connects, from the node and its parent node, one branch each at a time. Therefore, by avoiding unnecessary backtracking and connections of nodes during the search, the DELTA provides a lower computational complexity than other near ML decoders, especially when the SNR is low. The DELTA in addition allows flexible tradeoffs between the computational complexity and BER performance.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows a block diagram of the MIMO system with N_(T) transmit and N_(R) receive antennas.

FIG. 2 is an example of the tree when m=4 and L=4.

FIG. 3 is a flow chart of the DELTA.

FIG. 4 is an example of the tree searching with the DELTA.

FIG. 5 shows some characteristics of various near ML decoding schemes.

FIG. 6 shows the BER performance of the DELTA for various values of N in 16- and 64-QAM when N_(T)=N_(R)=2.

FIG. 7 shows the average number of multiplications of the DELTA for various values of N in 16- and 64-QAM when N_(T)=N_(R)=2.

FIG. 8 shows the BER performance of the DELTA for various values of N in 16- and 64-QAM when N_(T)=N_(R)=4.

FIG. 9 shows the average number of multiplications of the DELTA for various values of N in 16- and 64-QAM when N_(T)=N_(R)=4.

FIG. 10 shows the BER performance of various MIMO decoding schemes in 16- and 64-QAM when N_(T)=N_(R)=2 and 4.

FIG. 11 shows the average number of multiplications of various MIMO decoding schemes in 16-QAM when N_(T)=N_(R)=2.

FIG. 12 shows the average number of multiplications of various MIMO decoding schemes in 64-QAM when N_(T)=N_(R)=2.

FIG. 13 shows the average number of multiplications of various MIMO decoding schemes in 16-QAM when N_(T)=N_(R)=4.

FIG. 14 shows the average number of multiplications of various MIMO decoding schemes in 64-QAM when N_(T)=N_(R)=4.

FIG. 15 shows the BER performance of the DELTA for various values of α in 16-QAM when N_(T)=N_(R)=4.

FIG. 16 shows the average number of multiplications of the DELTA for various values of α in 16-QAM when N_(T)=N_(R)=4.

DETAILED DESCRIPTION OF THE INVENTION

FIG. 1 shows a block diagram of the MIMO system with $N_T$ transmit and $N_R$ receive antennas. We assume that the data stream is demultiplexed into $N_T$ sub-streams and then sent simultaneously from the $N_T$ transmit antennas to the $N_R$ receive antennas over a rich-scattering, flat fading wireless channel. It is also assumed that a common quadrature amplitude modulation (QAM) is employed for all the sub-streams. Then, denoting by $\tilde{y}_j$ the complex signal received at the $j$-th receive antenna, the discrete-time baseband model of the received signal vector $\underline{\tilde{y}}=[\tilde{y}_1,\tilde{y}_2,\ldots,\tilde{y}_{N_R}]^T$ can be expressed as

$$\underline{\tilde{y}}=\tilde{H}\,\underline{\tilde{s}}+\underline{\tilde{v}}, \qquad (1)$$

where the superscript $T$ indicates the vector transpose, $\tilde{H}$ is the $N_R\times N_T$ channel matrix of independent and identically distributed (i.i.d.) complex Gaussian random variables with mean zero and unit variance, $\underline{\tilde{s}}=[\tilde{s}_1,\tilde{s}_2,\ldots,\tilde{s}_{N_T}]^T$ is the transmitted signal vector, and $\underline{\tilde{v}}=[\tilde{v}_1,\tilde{v}_2,\ldots,\tilde{v}_{N_R}]^T$ is the vector of i.i.d. complex additive Gaussian random variables with mean zero and variance $\sigma^2$. We assume that the estimation of the channel matrix $\tilde{H}$ has been completed before the decoding at the receiver.

Denoting by $\Re(\cdot)$ and $\Im(\cdot)$ the real and imaginary parts, respectively, the complex baseband model (1) can be transformed into a real representation as

$$\begin{aligned}
\underline{y} &= \begin{pmatrix}\Re(\underline{\tilde{y}})\\ \Im(\underline{\tilde{y}})\end{pmatrix}\\
&= \begin{pmatrix}\Re(\tilde{H}) & -\Im(\tilde{H})\\ \Im(\tilde{H}) & \Re(\tilde{H})\end{pmatrix}\begin{pmatrix}\Re(\underline{\tilde{s}})\\ \Im(\underline{\tilde{s}})\end{pmatrix}+\begin{pmatrix}\Re(\underline{\tilde{v}})\\ \Im(\underline{\tilde{v}})\end{pmatrix}\\
&= H\underline{s}+\underline{v}.
\end{aligned} \qquad (2)$$

In (2), $\underline{y}=[y_1,y_2,\ldots,y_n]^T$ is the real received signal vector, $\underline{s}=[s_1,s_2,\ldots,s_m]^T$ is the real transmitted signal vector, and $\underline{v}=[v_1,v_2,\ldots,v_n]^T$ is the vector of real i.i.d. additive Gaussian noise with mean zero and variance $\sigma^2/2$, where

$$m=2N_T \qquad (3)$$

and

$$n=2N_R. \qquad (4)$$

For simplicity, it is assumed in this invention that $n=m$ without loss of generality.
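The stacking in (2) can be illustrated with a short pure-Python sketch. The function names `to_real_model` and `matvec` and the numerical values are hypothetical and chosen only for illustration; the example is noise-free so that the identity $\underline{y}=H\underline{s}$ can be checked directly.

```python
def to_real_model(H_c, y_c):
    """Stack real and imaginary parts as in (2).

    H_c: list of N_R rows, each a list of N_T complex channel gains.
    y_c: list of N_R complex received samples.
    Returns (H, y): H is 2*N_R x 2*N_T, y has length 2*N_R.
    """
    top = [[h.real for h in row] + [-h.imag for h in row] for row in H_c]
    bottom = [[h.imag for h in row] + [h.real for h in row] for row in H_c]
    y = [v.real for v in y_c] + [v.imag for v in y_c]
    return top + bottom, y


def matvec(M, x):
    """Plain matrix-vector product for lists of lists."""
    return [sum(a * b for a, b in zip(row, x)) for row in M]


# Illustrative (made-up) 2x2 complex channel and 16-QAM-like symbols.
H_c = [[1 + 2j, 0.5 - 1j], [-1j, 2 + 0j]]
s_c = [0.5 - 1.5j, -0.5 + 0.5j]
y_c = [sum(h * s for h, s in zip(row, s_c)) for row in H_c]  # noise-free

H, y = to_real_model(H_c, y_c)
s = [v.real for v in s_c] + [v.imag for v in s_c]  # real transmitted vector
```

With these values, `matvec(H, s)` reproduces `y` up to rounding, which is the noise-free form of the last line of (2).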

Let us first QR decompose the channel matrix $H$ as $H=QR$, where $Q$ is an $m\times m$ unitary matrix such that

$$Q^TQ=I \qquad (5)$$

and $R=[r_{i,j}]$ is an $m\times m$ upper triangular matrix. Multiplying both sides of (2) by $Q^T$, we have

$$\underline{r}=R\underline{s}+\underline{w}, \qquad (6)$$

where

$$\underline{r}=Q^T\underline{y}=[r_1,r_2,\ldots,r_m]^T \qquad (7)$$

and $\underline{w}=Q^T\underline{v}$. Note that the statistical properties of the noise term $\underline{w}$ in (6) are the same as those of $\underline{v}$ in (2) because of (5).

Assuming that we consider the signal constellation

$$\mathcal{A}=\left\{-\frac{\sqrt{L}-1}{2},\,-\frac{\sqrt{L}-3}{2},\,\ldots,\,\frac{\sqrt{L}-3}{2},\,\frac{\sqrt{L}-1}{2}\right\} \qquad (8)$$

of $L$-QAM with $L=4,16,\ldots$, the set $\{R\underline{s}\}$ of vectors in (6) is a subset of the infinite lattice

$$\Lambda(R)=\{R\underline{s}:\ \underline{s}\in\mathcal{A}_\infty^m\} \qquad (9)$$

generated by $R$, where

$$\mathcal{A}_\infty=\left\{a+\frac{1}{2}:\ a\in\mathbb{Z}\right\} \qquad (10)$$

is an infinite augmentation of $\mathcal{A}$ with $\mathbb{Z}$ denoting the set of all integers. Then, the vector $\underline{r}$ of received signals can be considered as a perturbed lattice point due to the noise $\underline{w}$. Therefore, given the vector $\underline{r}$ and matrix $R$, the optimal solution $\hat{\underline{s}}$ is obtained as

$$\begin{aligned}
\hat{\underline{s}} &= \arg\min_{\underline{s}\in\mathcal{A}^m}\left\|\underline{r}-R\underline{s}\right\|^2\\
&= \arg\min_{\underline{s}\in\mathcal{A}^m}\sum_{i=1}^{m}\left(r_i-\sum_{j=i}^{m}r_{i,j}s_j\right)^2,
\end{aligned} \qquad (11)$$

where $\|\cdot\|$ denotes the Euclidean norm.

Exploiting the upper triangular property of the matrix $R$, the tree structure is used quite frequently to find the ML or near ML solution in the decoding of MIMO systems. Let us consider a $\sqrt{L}$-ary tree with $m+1$ layers stemming from a root located in the $(m+1)$-st layer, the highest layer. Then, a branch between the $(k+1)$-st and $k$-th layers of the tree denotes a possible value ($\in\mathcal{A}$) of the $k$-th element $s_k$ of the real transmitted signal vector $\underline{s}$, and a node of the tree denotes the vector of the branches in the unique path connecting the node and root. We will denote the $l$-th node in the $k$-th layer by the $(m-k+1)$-dimensional vector

$$\underline{s}_k^{(l)}=\left[s_{k,k}^{(l)},\,s_{k+1,k}^{(l)},\,\ldots,\,s_{m,k}^{(l)}\right]^T \qquad (12)$$

for $k=1,2,\ldots,m$ and $l=1,2,\ldots,\sqrt{L}^{\,m-k+1}$, with the root denoted by $\underline{s}_{m+1}^{(1)}$ for convenience. An example of the tree for $m=4$ and $L=4$ is shown in FIG. 2.

Let us define the node metric of a node as the sum of the lengths (metrics) of the branches of the unique path connecting the node and root. Specifically, defining the length $\varphi(\underline{s}_k^{(l)})$ of the branch between a node $\underline{s}_k^{(l)}$ and its parent node

$$\begin{aligned}
\underline{s}_{k+1}^{(p)} &= \left[s_{k+1,k+1}^{(p)},\,s_{k+2,k+1}^{(p)},\,\ldots,\,s_{m,k+1}^{(p)}\right]^T\\
&= \left[s_{k+1,k}^{(l)},\,s_{k+2,k}^{(l)},\,\ldots,\,s_{m,k}^{(l)}\right]^T
\end{aligned} \qquad (13)$$

as

$$\begin{aligned}
\varphi\left(\underline{s}_k^{(l)}\right) &= \left(r_k-\sum_{j=k}^{m}r_{k,j}s_{j,k}^{(l)}\right)^2\\
&= e_k^2\left(\underline{s}_k^{(l)}\right),
\end{aligned} \qquad (14)$$

the node metric $\Phi(\underline{s}_k^{(l)})$ of $\underline{s}_k^{(l)}$ can be obtained as

$$\begin{aligned}
\Phi\left(\underline{s}_k^{(l)}\right) &= \sum_{i=k}^{m}\varphi\left(\underline{s}_i^{(p_i)}\right)\\
&= \sum_{i=k}^{m}\left(r_i-\sum_{j=i}^{m}r_{i,j}s_{j,i}^{(p_i)}\right)^2\\
&= \sum_{i=k}^{m}\left(r_i-\sum_{j=i}^{m}r_{i,j}s_{j,k}^{(l)}\right)^2\\
&= \sum_{i=k}^{m}e_i^2\left(\underline{s}_k^{(l)}\right),
\end{aligned} \qquad (15)$$

where

$$e_i\left(\underline{s}_k^{(l)}\right)=r_i-\sum_{j=\max(i,k)}^{m}r_{i,j}s_{j,k}^{(l)} \qquad (16)$$

and $\underline{s}_{i+1}^{(p_{i+1})}$ is the parent node of $\underline{s}_i^{(p_i)}$ for $i=k,k+1,\ldots,m$ with $p_k=l$ and $p_{m+1}=1$. Note that we have used

$$s_{j,i}^{(p_i)}=s_{j,k}^{(l)} \qquad (17)$$

for $i=k,k+1,\ldots,m$ and $j=i,i+1,\ldots,m$ in obtaining the third line from the second line of (15).
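As a rough illustration of (14) to (16), the following sketch evaluates the residual $e_i$ and the node metric for a node stored as the list of its symbols from layer $k$ up to layer $m$. The helper names `residual` and `node_metric`, the list-based node representation, and the numbers are assumptions made for this example only.

```python
def residual(R, r, i, node):
    """e_i of (16) for node = [s_k, ..., s_m]; i and k are 1-based layers."""
    m = len(r)
    k = m - len(node) + 1          # layer of the node
    start = max(i, k)
    return r[i - 1] - sum(R[i - 1][j - 1] * node[j - k]
                          for j in range(start, m + 1))


def node_metric(R, r, node):
    """Node metric of (15): sum of squared residuals from layer k up to m."""
    m = len(r)
    k = m - len(node) + 1
    return sum(residual(R, r, i, node) ** 2 for i in range(k, m + 1))


# Made-up upper triangular R and transformed received vector r (m = 2).
R = [[2.0, 1.0], [0.0, 1.0]]
r = [1.0, 0.4]

parent = [0.5]        # node in layer 2: s_2 = 0.5
child = [0.5, 0.5]    # node in layer 1: s_1 = 0.5, s_2 = 0.5

# Branch length (14) of the child, i.e. the squared residual in its own layer.
phi_child = residual(R, r, 1, child) ** 2
```

In this example the child's node metric equals the parent's node metric plus `phi_child`, which is the additive structure that lets a tree search update metrics one branch at a time.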

It is straightforward to see that the problem described by (11) of finding the optimal solution $\hat{\underline{s}}\in\mathcal{A}^m$ is equivalent to the problem of finding the node $\underline{s}_1^{(l)}$ with the smallest node metric $\Phi(\underline{s}_1^{(l)})$ among the vectors

$$\left\{\underline{s}_1^{(1)},\,\underline{s}_1^{(2)},\,\ldots,\,\underline{s}_1^{(\sqrt{L}^{\,m})}\right\} \qquad (18)$$

in the first layer of the tree.

Let us make some definitions before we describe the proposed decoding scheme, the DELTA, in detail.

- Leaf node: a node not connected to any of the nodes in lower layers.
- Deepest node: a node in the lowest layer among the leaf nodes.
- Best node: a node with the smallest node metric among the leaf nodes.
- Best branch: the branch of a node with the smallest length among the branches not considered previously.

Clearly, there may exist two or more deepest nodes at any instant during the search over a tree. On the other hand, it is straightforward to see that the best branch of a node and the best node are unique with probability one.

In the metric-first search methods, the best node is determined among the leaf nodes not necessarily in the same layer, and then branches are connected from the best node to the nodes in the layer immediately below, making new leaf nodes. The algorithm continues until a best node in the first layer is found. The metric-first search methods generally offer good performance with moderate computational complexity when the SNR is high, but exhibit high computational complexity when the SNR is low because of frequent backtracking. In addition, when a best node is found, all branches from the best node are connected, resulting in unnecessary connection and consideration of nodes. The computational complexity therefore increases considerably as the size of signal constellation and number of antennas increase.
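The plain metric-first search described above (not the DELTA itself) can be sketched with a priority queue. This is a minimal illustration under stated assumptions: the function name `metric_first_search`, the list-based node representation, and the example values are hypothetical, and no limit is placed on the number of stored leaf nodes.

```python
import heapq


def metric_first_search(R, r, alphabet):
    """Plain metric-first (stack-type) search on the tree defined by R and r.

    Leaf nodes are kept in a min-heap keyed by node metric. The best leaf is
    popped and ALL of its branches are connected, as in the conventional
    method described in the text. A node is the list [s_k, ..., s_m].
    """
    m = len(r)
    heap = [(0.0, [])]                       # root: metric 0, no symbols yet
    while heap:
        metric, node = heapq.heappop(heap)   # best node among the leaf nodes
        if len(node) == m:                   # best node is in the first layer
            return node, metric
        i = m - len(node) - 1                # 0-based row of the next layer
        for a in alphabet:                   # connect every branch
            child = [a] + node
            e = r[i] - sum(R[i][j] * child[j - i] for j in range(i, m))
            heapq.heappush(heap, (metric + e * e, child))
    return None, float("inf")


# Made-up example with m = 2 and a two-point real alphabet.
R = [[2.0, 1.0], [0.0, 1.0]]
r = [1.0, 0.4]
solution, best_metric = metric_first_search(R, r, [-0.5, 0.5])
```

Because branch lengths are nonnegative, the first popped node that lies in the first layer has the smallest metric among all first-layer nodes. The loop over `alphabet` is the step that connects every branch of the best node, including branches that are never examined again.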

The number of branches considered from a best node can be reduced by connecting only one branch at a time starting from the best branch, which can be made possible by utilizing the SE enumeration frequently employed to improve the efficiency of the tree search. Specifically, let $\mathcal{Q}(\cdot)$ denote the quantization of $(\cdot)$ to the nearest element in the set $\mathcal{A}$: for example, $\mathcal{Q}(-0.2)=-0.5$ and $\mathcal{Q}(2.1)=1.5$ when $\mathcal{A}=\{-1.5,-0.5,0.5,1.5\}$. Then, after obtaining the best branch $S(\underline{s}_k^{(l)})$ of $\underline{s}_k^{(l)}$ as

$$S\left(\underline{s}_k^{(l)}\right)=\mathcal{Q}\left(\frac{e_{k-1}\left(\underline{s}_k^{(l)}\right)}{r_{k-1,k-1}}\right), \qquad (19)$$

the branches $\{B_j(\underline{s}_k^{(l)})\}_{j=1}^{\sqrt{L}}$ from $\underline{s}_k^{(l)}$ to nodes in layer $k-1$ can be arranged in the ascending order of the branch lengths as

$$\begin{cases}
\left\{S\left(\underline{s}_k^{(l)}\right),\ S\left(\underline{s}_k^{(l)}\right)+1,\ S\left(\underline{s}_k^{(l)}\right)-1,\ S\left(\underline{s}_k^{(l)}\right)+2,\ S\left(\underline{s}_k^{(l)}\right)-2,\ \ldots\right\}\cap\mathcal{A}, & \text{if } S\left(\underline{s}_k^{(l)}\right)\le\dfrac{e_{k-1}\left(\underline{s}_k^{(l)}\right)}{r_{k-1,k-1}},\\[2ex]
\left\{S\left(\underline{s}_k^{(l)}\right),\ S\left(\underline{s}_k^{(l)}\right)-1,\ S\left(\underline{s}_k^{(l)}\right)+1,\ S\left(\underline{s}_k^{(l)}\right)-2,\ S\left(\underline{s}_k^{(l)}\right)+2,\ \ldots\right\}\cap\mathcal{A}, & \text{if } S\left(\underline{s}_k^{(l)}\right)>\dfrac{e_{k-1}\left(\underline{s}_k^{(l)}\right)}{r_{k-1,k-1}}
\end{cases} \qquad (20)$$

with the SE enumeration. In essence, employing the re-arrangement by the SE enumeration, the branches can be considered in a more systematic way and the probability of searching only promising branches is maximized.
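The ordering in (19) and (20) can be illustrated by the sketch below, which takes the unconstrained estimate $x=e_{k-1}/r_{k-1,k-1}$ and returns the constellation points in SE order. The names `quantize` and `se_order` are hypothetical, and the sketch assumes the unit-spaced constellation of (8).

```python
def quantize(x, alphabet):
    """Nearest constellation point to x (the quantizer used in (19))."""
    return min(alphabet, key=lambda a: abs(a - x))


def se_order(x, alphabet):
    """Constellation points in the zig-zag order of (20) around x.

    Assumes the points are spaced by 1, as in (8). Candidates that fall
    outside the constellation are skipped, which is the intersection
    with the constellation in (20).
    """
    s0 = quantize(x, alphabet)
    step = 1.0 if s0 <= x else -1.0      # side on which x lies relative to s0
    members = set(alphabet)
    order = [s0]
    d = 1
    while len(order) < len(members):
        for cand in (s0 + d * step, s0 - d * step):
            if cand in members:
                order.append(cand)
        d += 1
    return order


A = [-1.5, -0.5, 0.5, 1.5]
order_a = se_order(-0.2, A)   # estimate lies between -0.5 and 0.5
order_b = se_order(2.1, A)    # estimate lies outside the constellation
```

For the two estimates used in the text, `order_a` is `[-0.5, 0.5, -1.5, 1.5]` and `order_b` is `[1.5, 0.5, -0.5, -1.5]`. Since the branch length is proportional to the squared distance between the estimate and the candidate point, this is also the ascending order of branch lengths.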

Based on the metric-first search and SE enumeration, the DELTA searches a tree starting from the root. Specifically, as an advanced variant of the metric-first search method, the DELTA (1) finds a best node in the tree, (2) determines if the best node deserves to be searched, and (3) considers one branch at a time starting from the best branch of the best node. More specifically, the DELTA can be described by the three steps below.

Step 1: Determining the Best Node

The metrics of all the leaf nodes, not necessarily in the same layer, are compared and the best node is selected.

Step 2: Checking the Layer of the Best Node

If the best node determined in Step 1 is a deepest node, we take Step 2-1. If the best node is not a deepest node, on the other hand, we take Step 2-2.

(A) Step 2-1: Checking if the Best Node is in the First Layer

If the best node is in the first layer, the best node is declared the solution and the search is terminated because, with probability one, any other node not searched yet has a node metric larger than or equal to that of the best node. On the other hand, if the best node is not in the first layer, we take Step 3, continuing the search by making a new leaf node in the layer immediately below.

(B) Step 2-2: Determining if the Best Node Deserves to be Searched

When the best node is not a deepest node, to reduce the computational complexity incurred from searching the same layers unnecessarily, we determine if the best node deserves further consideration by taking into account the expected length to the first layer. Specifically, when the best node is $\underline{s}_k^{(l)}$, we compare the length $\varphi(\underline{s}_k^{(l)})$ of the branch between $\underline{s}_k^{(l)}$ and its parent node with the threshold

$$\gamma_k^2=\left\{\frac{\Gamma\left(\frac{k}{2}+1\right)}{\pi^{\frac{k}{2}}}\prod_{i=1}^{k}r_{i,i}\right\}^{\frac{2}{k}}. \qquad (21)$$

If $\varphi(\underline{s}_k^{(l)})\le\gamma_k^2$, we regard $\underline{s}_k^{(l)}$ as deserving further consideration and take Step 3. If $\varphi(\underline{s}_k^{(l)})>\gamma_k^2$, on the other hand, we regard $\underline{s}_k^{(l)}$ as not deserving further consideration: consequently, we discard $\underline{s}_k^{(l)}$ and then return to Step 1. This procedure keeps us from unnecessary search and results in considerable reduction of computational complexity.

Before we delineate in detail the third step of the DELTA, let us explain the rationale behind the threshold $\gamma_k^2$ shown in (21). Interpreting physically, the threshold $\gamma_k^2$ is a measure of the expected length of (i.e., the expected value of the sum of the lengths of the segments along) the paths from the parent node of a best node in layer $k$ to a node in the first layer as we shall see shortly: taking the expected length into account, the threshold is derived as follows.

Basically, we are interested in the shortest path among all the paths from the parent node to a node in the first layer and expect the best node to be in the shortest path. Consider the parent node $\underline{s}_{k+1}^{(p)}$ of the best node $\underline{s}_k^{(l)}$ and a node

$$\underline{s}_1^{(f)}=\left[s_{1,1}^{(f)},\,s_{2,1}^{(f)},\,\ldots,\,s_{m,1}^{(f)}\right]^T \qquad (22)$$

in the first layer of the shortest path, where

$$s_{j,1}^{(f)}=s_{j,k}^{(l)} \qquad (23)$$

for $j=k+1,k+2,\ldots,m$. Since the length of a branch is nonnegative as is apparent from (14), if $\underline{s}_k^{(l)}$ is in the shortest path, the length $\mathcal{L}_{k+1,1}$ of the shortest path connecting $\underline{s}_{k+1}^{(p)}$ and $\underline{s}_1^{(f)}$ should be larger than or equal to the length $\varphi(\underline{s}_k^{(l)})$ of the branch between $\underline{s}_k^{(l)}$ and $\underline{s}_{k+1}^{(p)}$: that is, we have

$$\varphi\left(\underline{s}_k^{(l)}\right)\le\mathcal{L}_{k+1,1}. \qquad (24)$$

Now, using (23), the length $\mathcal{L}_{k+1,1}$ can be rewritten as

$$\begin{aligned}
\mathcal{L}_{k+1,1} &= \sum_{i=1}^{k}\left(r_i-\sum_{j=i}^{m}r_{i,j}s_{j,1}^{(f)}\right)^2\\
&= \sum_{i=1}^{k}\left(r_i-\sum_{j=k+1}^{m}r_{i,j}s_{j,k}^{(l)}-\sum_{j=i}^{k}r_{i,j}s_{j,1}^{(f)}\right)^2\\
&= \sum_{i=1}^{k}\left(e_i\left(\underline{s}_{k+1}^{(p)}\right)-\sum_{j=i}^{k}r_{i,j}s_{j,1}^{(f)}\right)^2\\
&= \left\|\underline{r}_k'-R_k\underline{s}_k\right\|^2,
\end{aligned} \qquad (25)$$

implying that we can interpret $\mathcal{L}_{k+1,1}$ as the square of the distance between a transformed received signal vector $\underline{r}_k'$ and a lattice point $R_k\underline{s}_k$ in the $k$-dimensional lattice

$$\Lambda_k(R_k)=\{R_k\underline{s}_k:\ \underline{s}_k\in\mathcal{A}^k\} \qquad (26)$$

generated by the submatrix

$$R_k=\begin{pmatrix}
r_{1,1} & r_{1,2} & \cdots & r_{1,k}\\
0 & r_{2,2} & \cdots & r_{2,k}\\
\vdots & \vdots & \ddots & \vdots\\
0 & 0 & \cdots & r_{k,k}
\end{pmatrix} \qquad (27)$$

of $R$, where

$$\underline{r}_k'=\left[e_1\left(\underline{s}_{k+1}^{(p)}\right),\,e_2\left(\underline{s}_{k+1}^{(p)}\right),\,\ldots,\,e_k\left(\underline{s}_{k+1}^{(p)}\right)\right]^T \qquad (28)$$

and

$$\underline{s}_k=\left[s_{1,1}^{(f)},\,s_{2,1}^{(f)},\,\ldots,\,s_{k,1}^{(f)}\right]^T. \qquad (29)$$

Note that $\mathcal{L}_{k+1,1}$ is the smallest value among all the lengths of paths from $\underline{s}_{k+1}^{(p)}$ to a node in the first layer. Apparently, the smallest value is obtained when $R_k\underline{s}_k$ is the closest lattice point, among all the lattice points in the lattice $\Lambda_k(R_k)$, from $\underline{r}_k'$. In other words, to have the smallest value for $\mathcal{L}_{k+1,1}$, $\underline{r}_k'$ should be included in the Voronoi region

$$\mathcal{V}\left(\Lambda_k(R_k),\underline{s}_k\right)=\left\{\underline{r}_k'\in\mathbb{R}^k:\ \left\|\underline{r}_k'-R_k\underline{s}_k\right\|\le\left\|\underline{r}_k'-R_k\tilde{\underline{s}}_k\right\|,\ \forall\,R_k\tilde{\underline{s}}_k\in\Lambda_k(R_k)\right\} \qquad (30)$$

of the lattice point $R_k\underline{s}_k$, where $\mathbb{R}$ denotes the set of real numbers. Note that the Voronoi region (30) denotes the set of all vectors $\underline{r}_k'$ closer to $R_k\underline{s}_k$ than to any other lattice point in $\Lambda_k(R_k)$.

Next, as the exact boundary of a Voronoi region is difficult to describe succinctly in most cases, let us approximate the $k$-dimensional Voronoi region $\mathcal{V}(\Lambda_k(R_k),\underline{s}_k)$ by the $k$-dimensional hypersphere centered at $R_k\underline{s}_k$ with the same volume. Here, because the volume of the Voronoi region of a lattice point in a finite lattice varies depending on the location of the lattice point, we evaluate the volume of the Voronoi region of $R_k\underline{s}_k$ in the infinite lattice $\Lambda(R_k)$ instead of that in the finite lattice $\Lambda_k(R_k)$: in other words, we assume

$$\mathrm{Vol}\left(\mathcal{V}\left(\Lambda_k(R_k),\underline{s}_k\right)\right)=\mathrm{Vol}\left(\mathcal{V}\left(\Lambda(R_k),\underline{s}_k\right)\right), \qquad (31)$$

where

$$\begin{aligned}
\mathrm{Vol}\left(\mathcal{V}\left(\Lambda(R_k),\underline{s}_k\right)\right) &= \sqrt{\det\left(R_k^TR_k\right)}\\
&= \prod_{i=1}^{k}r_{i,i}
\end{aligned} \qquad (32)$$

for all $\underline{s}_k$. Then, equating $\prod_{i=1}^{k}r_{i,i}$ with the volume

$$\frac{\pi^{k/2}}{\Gamma\left(k/2+1\right)}d^k \qquad (33)$$

of the $k$-dimensional hypersphere with radius $d$, we get

$$\begin{aligned}
d &= \left(\frac{\Gamma\left(\frac{k}{2}+1\right)}{\pi^{\frac{k}{2}}}\prod_{i=1}^{k}r_{i,i}\right)^{\frac{1}{k}}\\
&= \gamma_k.
\end{aligned} \qquad (34)$$

In short, the threshold is the radius squared of the hypersphere obtained by approximating the Voronoi region of the closest lattice point from $\underline{r}_k'$ as the $k$-dimensional hypersphere with the same volume.
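The equal-volume hypersphere argument can be checked numerically with a small sketch. The function name `branch_length_threshold` and the example matrix are hypothetical. The use of `abs()` on the diagonal entries is an added assumption, included so that a QR routine returning negative diagonal entries does not break the $k$-th root; the text itself writes the product without absolute values.

```python
import math


def branch_length_threshold(R, k):
    """Squared radius gamma_k^2 of the k-dimensional hypersphere whose volume
    equals the product of the first k diagonal entries of R (1-based k)."""
    volume = 1.0
    for i in range(k):
        volume *= abs(R[i][i])       # abs() is an assumption, see lead-in
    unit_ball = math.pi ** (k / 2) / math.gamma(k / 2 + 1)
    return (volume / unit_ball) ** (2.0 / k)


# Made-up upper triangular matrix.
R = [[2.0, 1.0], [0.0, 1.0]]
gamma1_sq = branch_length_threshold(R, 1)   # one-dimensional case
gamma2_sq = branch_length_threshold(R, 2)   # two-dimensional case
```

For $k=1$ the lattice spacing is $r_{1,1}=2$, and the threshold comes out as 1, i.e. a radius of half the spacing, which matches the exact one-dimensional Voronoi region. For $k=2$ the value is $2/\pi$, the squared radius of a disc of area 2.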

Now, for any point $\underline{r}_k'$ in the Voronoi region approximated by the hypersphere, we have

$$\left\|\underline{r}_k'-R_k\underline{s}_k\right\|^2\le d^2. \qquad (35)$$

Together with (24) and (25), inequality (35) can be used to obtain

$$\varphi\left(\underline{s}_k^{(l)}\right)\le\gamma_k^2 \qquad (36)$$

as a condition for $\underline{s}_k^{(l)}$ to satisfy in order to have any 'margin' in length to the first layer. In other words, the result (36) implies that, only when (36) is satisfied, the path from the root to $\underline{s}_k^{(l)}$ is short enough for $\underline{s}_k^{(l)}$ to deserve further search or consideration. If $\varphi(\underline{s}_k^{(l)})>\gamma_k^2$, on the other hand, $\underline{s}_k^{(l)}$ is not in the shortest path, which means that $\underline{s}_k^{(l)}$ does not deserve further consideration.

Step 3: Adding New Leaf Nodes

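Again, let us denote the best node passed through Step 2 by $\underline{s}_k^{(l)}$ and the parent node of $\underline{s}_k^{(l)}$ by $\underline{s}_{k+1}^{(p)}$. We decide the best branch $S(\underline{s}_k^{(l)})$ of $\underline{s}_k^{(l)}$ with (19) and connect $\underline{s}_k^{(l)}$ to

$$\underline{s}_{k-1}^{(c)}=\left[S\left(\underline{s}_k^{(l)}\right),\,\underline{s}_k^{(l)}\right]^T \qquad (37)$$

in the $(k-1)$-st layer. Then, we compute the branch length

$$\varphi\left(\underline{s}_{k-1}^{(c)}\right)=\left\{e_{k-1}\left(\underline{s}_k^{(l)}\right)-r_{k-1,k-1}\,s_{k-1,k-1}^{(c)}\right\}^2 \qquad (38)$$

to obtain the node metric $\Phi(\underline{s}_{k-1}^{(c)})$ of $\underline{s}_{k-1}^{(c)}$ as

$$\Phi\left(\underline{s}_{k-1}^{(c)}\right)=\Phi\left(\underline{s}_k^{(l)}\right)+\varphi\left(\underline{s}_{k-1}^{(c)}\right), \qquad (39)$$

where $\Phi(\underline{s}_k^{(l)})$ has previously been computed in the $k$-th layer.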

(A) Step 3-1: When There Remains No Branch from the Parent Node

If $\underline{s}_k^{(l)}$ is the root or if all the branches of $\underline{s}_{k+1}^{(p)}$ have already been connected, we return to Step 1.

(B) Step 3-2: When There Remains at Least One Branch from the Parent Node

On the other hand, if $\underline{s}_k^{(l)}$ is not the root and $\underline{s}_{k+1}^{(p)}$ has at least one branch not connected yet, we connect the best branch from $\underline{s}_{k+1}^{(p)}$ to form a new leaf node, and then discard a node with the largest metric if necessary.

Specifically, let us first define the $(j+1)$-st 'branch increment' of $\underline{s}_{k+1}^{(p)}$ by

$$\Delta_{j+1}\left(\underline{s}_{k+1}^{(p)}\right)=\begin{cases}
\mathrm{sgn}\left(\dfrac{e_k\left(\underline{s}_{k+1}^{(p)}\right)}{r_{k,k}}-s_{k,k}^{(l)}\right), & \text{when } j=0,\\[2ex]
-\Delta_j\left(\underline{s}_{k+1}^{(p)}\right)-\mathrm{sgn}\left(\Delta_j\left(\underline{s}_{k+1}^{(p)}\right)\right), & \text{when } j=1,2,\ldots,
\end{cases} \qquad (40)$$

where

$$\mathrm{sgn}(x)=\begin{cases}1, & x\ge 0,\\ -1, & x<0.\end{cases} \qquad (41)$$

Then, when $b$ branch increments of $\underline{s}_{k+1}^{(p)}$ have already been computed, the best branch $S(\underline{s}_{k+1}^{(p)})$ of $\underline{s}_{k+1}^{(p)}$ is obtained as

$$S\left(\underline{s}_{k+1}^{(p)}\right)=\begin{cases}
s_{k,k}^{(l)}+\Delta_{b+1}\left(\underline{s}_{k+1}^{(p)}\right), & \text{if } s_{k,k}^{(l)}+\Delta_{b+1}\left(\underline{s}_{k+1}^{(p)}\right)\in\mathcal{A},\\[1ex]
s_{k,k}^{(l)}+\Delta_{b+1}\left(\underline{s}_{k+1}^{(p)}\right)+\Delta_{b+2}\left(\underline{s}_{k+1}^{(p)}\right), & \text{if } s_{k,k}^{(l)}+\Delta_{b+1}\left(\underline{s}_{k+1}^{(p)}\right)\notin\mathcal{A}.
\end{cases} \qquad (42)$$

(Note that we can compute the increments recursively with only one storing element. In addition, if we compute $\Delta_1(\underline{s}_{k+1}^{(p)})$ right after we have obtained $\underline{s}_{k+1}^{(p)}$ using (19), we can avoid the division of $e_k(\underline{s}_{k+1}^{(p)})/r_{k,k}$, saving some computation.) Then, we connect $\underline{s}_{k+1}^{(p)}$ with

$$\underline{s}_k^{(l')}=\left[S\left(\underline{s}_{k+1}^{(p)}\right),\,\underline{s}_{k+1}^{(p)}\right]^T \qquad (43)$$

and compute the branch length

$$\varphi\left(\underline{s}_k^{(l')}\right)=\left\{e_k\left(\underline{s}_{k+1}^{(p)}\right)-r_{k,k}\,s_{k,k}^{(l')}\right\}^2 \qquad (44)$$

of $\underline{s}_k^{(l')}$ to obtain the node metric $\Phi(\underline{s}_k^{(l')})$ of $\underline{s}_k^{(l')}$ as

$$\begin{aligned}
\Phi\left(\underline{s}_k^{(l')}\right) &= \Phi\left(\underline{s}_{k+1}^{(p)}\right)+\varphi\left(\underline{s}_k^{(l')}\right)\\
&= \Phi\left(\underline{s}_k^{(l)}\right)-\varphi\left(\underline{s}_k^{(l)}\right)+\varphi\left(\underline{s}_k^{(l')}\right).
\end{aligned} \qquad (45)$$

Note that $e_k(\underline{s}_{k+1}^{(p)})$ in (44) has already been computed and stored when $S(\underline{s}_{k+1}^{(p)})$ was obtained using (19), and $\Phi(\underline{s}_k^{(l)})$ and $\varphi(\underline{s}_k^{(l)})$ in (45) have already been computed and stored when $\underline{s}_k^{(l)}$ was connected with $\underline{s}_{k+1}^{(p)}$. Therefore, we can compute $\Phi(\underline{s}_k^{(l')})$ without storing $\Phi(\underline{s}_{k+1}^{(p)})$. Next, if there are more than $N$ leaf nodes (in which case, the number of leaf nodes ought to be $N+1$), we retain only $N$ leaf nodes including $\underline{s}_{k-1}^{(c)}$ by discarding a node with the largest node metric among the leaf nodes not taking $\underline{s}_{k-1}^{(c)}$ into account. Then, we return to Step 1.
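The increment recursion of (40) to (42) can be traced with the sketch below. It rests on one reading of the text that should be treated as an assumption: each increment is applied to the most recently connected sibling, so that successive calls walk the constellation in zig-zag order. The names `sgn` and `sibling_sequence` are hypothetical, and a unit-spaced constellation as in (8) is assumed.

```python
def sgn(x):
    """Sign function of (41): +1 for x >= 0, -1 otherwise."""
    return 1 if x >= 0 else -1


def sibling_sequence(x, first, alphabet):
    """Order in which the branches of a parent node are connected.

    x     : unconstrained estimate e_k / r_kk at the parent node
    first : branch connected first, i.e. the quantized estimate of (19)
    Each new branch is the previous one plus the next increment of (40);
    if that lands outside the constellation, the following increment is
    added as well, as in the second case of (42).
    """
    members = set(alphabet)
    sequence = [first]
    delta = sgn(x - first)               # Delta_1
    current = first
    while len(sequence) < len(members):
        candidate = current + delta
        delta = -delta - sgn(delta)      # next increment from (40)
        if candidate not in members:
            candidate += delta           # second case of (42)
            delta = -delta - sgn(delta)
        sequence.append(candidate)
        current = candidate
    return sequence


A = [-1.5, -0.5, 0.5, 1.5]
seq_inside = sibling_sequence(-0.2, -0.5, A)
seq_edge = sibling_sequence(2.1, 1.5, A)
```

Under this reading, `seq_inside` is `[-0.5, 0.5, -1.5, 1.5]` and `seq_edge` is `[1.5, 0.5, -0.5, -1.5]`, the same orders that the SE enumeration of (20) gives for these estimates, while only the single value `delta` has to be stored per parent node.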

A flow chart of the DELTA and an example depicting the application of the DELTA when m=4, L=16, and N=16 are shown in FIGS. 3 and 4, respectively. In FIG. 4, the number inside a node denotes the order of the formation of the node.

By employing the metric-first search method, the branch length threshold, and SE enumeration, the DELTA avoids unnecessary connections of nodes and backtracking as much as possible. Therefore, the DELTA can offer significant reduction in the computational complexity compared to other near ML schemes. The QRD-Stack is also based on the metric-first search as the DELTA, but does not make use of a threshold and the SE enumeration. When a best node is determined, the QRD-Stack connects all branches from the best node while the DELTA connects only the best branches from the best and its parent nodes if the best node deserves further consideration. The efficient QRD-M is based on the breadth-first search method and reduces the number of nodes retained in each layer by using a threshold. The efficient QRD-M compares node metrics of leaf nodes with the threshold to discard hopeless leaf nodes and connects all branches from the leaf nodes retained. In addition, the efficient QRD-M computes the threshold by using the partial DFE in every layer while the DELTA computes the threshold by using the channel matrix only when necessary. Like the DELTA, the SE2 employs the SE enumeration to consider one branch at a time starting from the best branch of a leaf node. However, since the SE2 is based on the depth-first search, the SE2 can consider only one leaf node at a time during the search while the DELTA considers a number of leaf nodes simultaneously during the search. FIG. 5 summarizes some of the key characteristics of various near ML decoding schemes.

Because the computational complexity from the channel estimation and channel decoding does not depend on the specific choice of a decoder, and the computational complexity from adders and look-up tables is sufficiently small compared to that from multipliers, the relative complexity of decoders is normally compared by the number of multiplications in most studies. Based on this rationale, the computational complexity of decoders for MIMO systems is also evaluated by means of the number of multiplications in this invention.

In the preprocessing step of the decoding, performing QRD of the channel matrix H and computing $Q^{T} y$ require approximately $2m^{3}/3$ and $m^{2}$ multiplications, respectively. Let us next take a look at the number of multiplications required for each of the steps in the DELTA. Clearly, no multiplication is needed for Steps 1, 2, and 2-1. In Step 2-2, to obtain the threshold, we first compute the volume of the k-dimensional Voronoi region, which requires $(k-1)$ multiplications, and then multiply the volume by a constant $\Gamma(k/2+1)/\pi^{k/2}$. Here, we assume that the constants

$\begin{matrix}\left\{ \frac{\Gamma\left( \frac{k}{2} + 1 \right)}{\pi^{\frac{k}{2}}} \right\}_{k = 1}^{m} & (46)\end{matrix}$
are available at the receiver, and that the k-th root in (21) has the same computational complexity as $(k-1)$ multiplications. Then, computing the threshold $\gamma_{k}^{2}$ requires
$\begin{matrix}(k-1)+1+(k-1)+1 = 2k & (47)\end{matrix}$
multiplications. In Step 3, when a new leaf node is generated from each of the best node and its parent node, $(m-k+4)$ and 2 multiplications are required to decide the best branch and to compute the node metric, respectively.
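The threshold computation counted in (47) can be sketched in Python as follows. Equation (21) is not reproduced in this passage, so the sketch assumes that $\gamma_{k}^{2}$ is the squared radius of the k-dimensional hypersphere whose volume equals the Voronoi-region volume, which is consistent with the four terms of (47). The function names are hypothetical, and the caller is assumed to supply the k diagonal entries of the submatrix $R_{k}$.

```python
import math


def hypersphere_threshold(diag_entries):
    """Squared radius of the k-sphere with the same volume as the Voronoi region.

    ASSUMPTION: this stands in for Eq. (21), which is not shown here.
    diag_entries: the k diagonal entries of the upper triangular submatrix R_k.
    """
    k = len(diag_entries)
    # Voronoi-region volume as the product of the diagonal magnitudes
    # (abs() guards against QR conventions with negative diagonal entries).
    volume = math.prod(abs(r) for r in diag_entries)
    # Constant from (46), assumed precomputed at the receiver in practice.
    constant = math.gamma(k / 2 + 1) / math.pi ** (k / 2)
    # k-th root gives the radius; squaring gives the threshold.
    radius = (volume * constant) ** (1.0 / k)
    return radius ** 2


def threshold_multiplications(k):
    """Multiplication count of Eq. (47): product, constant, k-th root, squaring."""
    return (k - 1) + 1 + (k - 1) + 1


# One-dimensional check: a Voronoi interval of length 2 has radius 1.
gamma_sq_1d = hypersphere_threshold([2.0])
```

For k=1 the constant in (46) equals 1/2, so an interval of length 2 maps to a radius of 1, which is a quick sanity check on the equal-volume construction.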

Let us briefly consider the minimum number of multiplications in the decoding with the DELTA. In the best case, which is quite frequent at high SNR, the DELTA searches the tree with neither backtracking nor comparison with the threshold. In this case, the DELTA connects only $(2m-1)$ nodes for the decoding, and requires

$\begin{matrix}\frac{2}{3}m^{3} + \frac{3m^{2} + 9m}{2} - 2 & (48)\end{matrix}$
multiplications in all the steps including the preprocessing step.
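As a quick numerical illustration of (48), the following Python sketch evaluates the best-case count and the preprocessing share $2m^{3}/3 + m^{2}$ stated earlier. The function names are assumptions made here; exact fractions are used only because the QRD term $2m^{3}/3$ is an approximation that is not always an integer.

```python
from fractions import Fraction


def preprocessing_multiplications(m):
    """Approximate preprocessing cost: QRD (2m^3/3) plus computing Q^T y (m^2)."""
    m = Fraction(m)
    return Fraction(2, 3) * m ** 3 + m ** 2


def best_case_multiplications(m):
    """Eq. (48): best-case total, preprocessing included."""
    m = Fraction(m)
    return Fraction(2, 3) * m ** 3 + (3 * m ** 2 + 9 * m) / 2 - 2


# For m = 3 the total is 43, of which 27 come from preprocessing.
total_m3 = best_case_multiplications(3)
search_only_m3 = total_m3 - preprocessing_multiplications(3)
```

The difference between the two functions is the part attributable to the tree search itself, which grows only quadratically in m in the best case.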

On the other hand, since various cases occur randomly under the influence of the SNR when a tree is searched in a practical environment, the minimum number of multiplications is not an adequate indicator of the computational complexity of a decoder. Therefore, we evaluate and compare the computational complexity of decoders in terms of the average number of multiplications, obtained as the average over $10^{6}$ Monte-Carlo runs through computer simulations.

Let us now evaluate the computational complexity of the DELTA and other decoders. In the simulations, it is assumed that the transmitter has no channel state information and all symbols are transmitted with the equal energy

$\begin{matrix}E_{s} = \frac{E_{tot}}{N_{T}} & (49)\end{matrix}$
over a rich-scattering and flat Rayleigh fading channel, where $E_{tot}$ is the total energy used over one symbol duration at the transmitter. Then, the SNR at each receive antenna is expressed as

$\begin{matrix}\frac{E_{s}N_{T}}{\sigma^{2}}. & (50)\end{matrix}$
In the simulation, we evaluate and compare the BER performance and computational complexity of the DELTA, QRD-Stack, efficient QRD-M, SE2, and IRA for several QAM constellations and numbers of antennas. The performance of two representative ML decoders, the BSIDE and SD, is also considered and compared with the near ML decoders. In the evaluation of the computational complexity, the complexity in the preprocessing is taken into account.
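The relations (49) and (50) can be turned into a small helper that derives the noise variance for a target SNR. This is a sketch only: the function names are hypothetical, and it assumes the SNR is specified in decibels, which the text does not state.

```python
def symbol_energy(e_tot, n_t):
    """Eq. (49): equal energy per transmitted symbol."""
    return e_tot / n_t


def noise_variance(snr_db, e_tot, n_t):
    """Solve Eq. (50), SNR = E_s * N_T / sigma^2, for sigma^2.

    ASSUMPTION: snr_db is the SNR at each receive antenna expressed in dB.
    """
    snr_linear = 10 ** (snr_db / 10)
    return symbol_energy(e_tot, n_t) * n_t / snr_linear


# Example: E_tot = 4 shared by N_T = 4 antennas at 10 dB SNR.
sigma_sq = noise_variance(snr_db=10.0, e_tot=4.0, n_t=4)
```

Because $E_{s} N_{T} = E_{tot}$, the SNR in (50) depends only on the total transmit energy and the noise variance, not on the number of transmit antennas.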

FIGS. 6-9 show the variation of the BER performance and computational complexity of the DELTA when the number N of leaf nodes retained varies. It is interesting to see that the computational complexity of the DELTA barely increases while the BER performance improves considerably when the value of N increases. Specifically, in both 16- and 64-QAM, N=8 and N=32 would be quite reasonable choices when $N_{T}=N_{R}=2$ and $N_{T}=N_{R}=4$, respectively.

FIG. 10 shows the BER performance as a function of the SNR for various numbers of antennas and sizes of signal constellation. In the DELTA, efficient QRD-M, and QRD-Stack, the number N of nodes retained is set equal to L in the L-QAM constellation so that the efficient QRD-M and QRD-Stack can also exhibit near ML performance. It is observed that the DELTA and other near ML decoders have almost the same BER performance, all very close to the optimal BER performance.

FIGS. 11-14 show the average number of multiplications plotted as a function of the SNR for various numbers of antennas and sizes of signal constellation, where the initial radius for the SD was obtained by the DFE algorithm. The solid, dashed, and dotted lines are used to signify decoders based on the metric-, breadth-, and depth-first search methods, respectively. We can observe that the DELTA generally has lower computational complexity than other near ML decoders in terms of the average number of multiplications. In addition, the gain in the computational complexity of the DELTA is more noticeable at low SNR and is quite robust to the variations of the size of signal constellation and SNR. Although the QRD-Stack is also based on the metric-first search, the computational complexity of the QRD-Stack increases as the size of signal constellation increases, especially at low SNR.

It is interesting to observe that the DELTA incurs the highest computational complexity when the SNR is moderate, while other decoders exhibit higher computational complexity when the SNR is lower. A possible explanation for this is as follows. First, as the SNR decreases with the signal power fixed, the average length of a branch would grow with the noise variance, while the threshold $\gamma_{k}^{2}$ is not influenced by the SNR. Thus, the number of branches with lengths longer than the threshold would increase and more branches would be discarded as the SNR decreases, resulting in reduced computational complexity of the DELTA. Secondly, as the SNR increases with the signal power fixed, the average length of the best branch of a best node would get shorter, and therefore the child node of the best node would have a higher probability of becoming a new best node. In other words, as the SNR increases, best nodes would be generated mostly in the forward direction with fewer 'side' branches and the search would reach the first layer sooner, resulting in lower computational complexity. Due to these two opposing effects, the DELTA exhibits the highest computational complexity when the SNR is moderate.

In addition, the DELTA can allow flexible tradeoffs between the BER performance and computational complexity by changing the threshold adaptively. FIGS. 15 and 16 show the variation of the BER performance and computational complexity, respectively, of the DELTA when the threshold is multiplied by a positive constant α for N=16, $N_{T}=N_{R}=4$, and 16-QAM. As α gets close to 0, the BER performance deteriorates and the computational complexity decreases (the complexity eventually becomes almost constant at all SNR). Because the BER performance of the DELTA is already very close to the ML performance when α=1, the gain in the BER performance is negligible when α>1, and adaptation of the threshold with α>1 is not useful.

1. The method of decoding received signals in MIMO systems, the method performed by an apparatus and comprising: calculating, by the apparatus, a threshold by approximating a k-dimensional Voronoi region $V(\Lambda_{k}(R_{k}), \underline{s}_{k})$ by a k-dimensional hypersphere with the same volume, where a volume of the k-dimensional Voronoi region is obtained by using the following equations,
$\mathrm{Vol}\left(V\left(\Lambda_{k}(R_{k}), \underline{s}_{k}\right)\right) = \mathrm{Vol}\left(V\left(\Lambda(R_{k}), \underline{s}_{k}\right)\right)$
and
$\begin{aligned}\mathrm{Vol}\left(V\left(\Lambda(R_{k}), \underline{s}_{k}\right)\right) &= \sqrt{\det\left(R_{k}^{T}R_{k}\right)} \\ &= \prod_{i=1}^{k} r_{i,i},\end{aligned}$
where $V(\Lambda_{k}(R_{k}), \underline{s}_{k}) = \{\underline{r}_{k}' \in \mathbb{R}^{k} : \|\underline{r}_{k}' - R_{k}\underline{s}_{k}\| \le \|\underline{r}_{k}' - R_{k}\tilde{\underline{s}}_{k}\|, \forall R_{k}\tilde{\underline{s}}_{k} \in \Lambda_{k}(R_{k})\}$ is the Voronoi region of the lattice point $R_{k}\underline{s}_{k}$ in the lattice $\Lambda_{k}(R_{k})$, $\mathbb{R}$ is a set of real numbers, $\Lambda_{k}(R_{k}) = \{R_{k}\underline{s}_{k} : \underline{s}_{k} \in A^{k}\}$ is the k-dimensional finite lattice generated by a k-dimensional submatrix $R_{k}$ of R, $A = \{-(\sqrt{L}-1)/2, -(\sqrt{L}-3)/2, \ldots, (\sqrt{L}-3)/2, (\sqrt{L}-1)/2\}$ is the signal constellation of L-QAM with L=4, 16, . . . , $\Lambda(R_{k}) = \{R_{k}\underline{s}_{k} : \underline{s}_{k} \in A_{\infty}^{k}\}$ is a k-dimensional infinite lattice generated by the k-dimensional submatrix $R_{k}$ of R, $A_{\infty} = \{a + 0.5 : a \in \mathbb{Z}\}$, $\mathbb{Z}$ is a set of all integers, and $r_{i,i}$ is an i-th diagonal element of the upper triangular matrix $R = [r_{i,j}]$ when a channel matrix H is QR decomposed.
2. The method of claim 1, wherein the apparatus includes a decoder which is used to calculate the threshold.